AITopics | Dunedin

The Artificial Intelligence field has focused on developing optimisation methods to solve multiple problems, specifically problems that were thought to be solvable only through cognition. The results have been outstanding, even surpassing the Turing Test. However, we have found that these optimisation methods share some fundamental flaws that prevent them from becoming a true artificial cognition. Specifically, the field has identified catastrophic forgetting as a fundamental obstacle to developing such cognition. This paper formally proves that this problem is inherent to optimisation methods, and as such it will always limit approaches that try to solve the Artificial General Intelligence problem as an optimisation problem. Additionally, it addresses the problem of overfitting and discusses other, smaller problems that optimisation methods pose. Finally, it empirically shows how world-modelling methods avoid suffering from either problem. In conclusion, the field of Artificial Intelligence needs to look outside the machine learning field to find methods capable of developing an artificial cognition. There is a common goal in the Artificial Intelligence field: approaching the achievement of an artificial cognition by producing results similar to those produced by a natural cognition (i.e. a human). That is, the efforts in this field have focused on mimicking the effects of cognition. This approach has produced a plethora of optimisation methods that try to solve problems that are considered solvable only by humans. The underlying assumption was that, if some algorithm is able to solve these problems, it will be due to the emergence of cognition (or at least some kind of cognition-like reasoning).

artificial intelligence, machine learning, natural language, (15 more...)

arXiv.org Artificial Intelligence

2507.03045

Country:

Europe > Austria > Vienna (0.14)
North America > United States > Wisconsin (0.05)
Asia > Thailand > Bangkok > Bangkok (0.04)
(22 more...)

Genre: Research Report > New Finding (0.46)

Industry:

Information Technology (0.68)
Health & Medicine > Therapeutic Area (0.48)
Education > Educational Setting (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Classifying States of the Hopfield Network with Improved Accuracy, Generalization, and Interpretability

McAlister, Hayden, Robins, Anthony, Szymanski, Lech

arXiv.org Artificial Intelligence, Mar-4-2025

We extend the existing work on Hopfield network state classification, employing more complex models that remain interpretable, such as densely-connected feed-forward deep neural networks and support vector machines. The states of the Hopfield network can be grouped into several classes, including learned (those presented during training), spurious (stable states that were not learned), and prototype (stable states that were not learned but are representative of a subset of learned states). It is often useful to determine which class a given state belongs to, for example to ignore spurious states when retrieving from the network. Previous research has approached the state classification task with simple linear methods, most notably the stability ratio. We deepen the research on classifying states from prototype-regime Hopfield networks, investigating how varying the factors that strengthen prototypes influences the state classification task. We study the generalizability of different classification models when trained on states derived from different prototype tasks; for example, can a network trained on a Hopfield network with 10 prototypes classify states from a network with 20 prototypes? We find that simple models often outperform the stability ratio while remaining interpretable. These models require surprisingly little training data and generalize exceptionally well to states generated by a range of Hopfield networks, even those that were trained on exceedingly different datasets.
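To make the state classes in the abstract above concrete, here is a minimal sketch of a Hopfield network with Hebbian weights and a fixed-point check. It is not the authors' method and does not implement the stability ratio or the prototype class: it only separates learned states from stable-but-unlearned (spurious) and unstable ones by direct comparison. The function names, the two example patterns, and the "unstable" label are assumptions chosen for illustration.

```python
def train_hebbian(patterns):
    """Return the Hebbian weight matrix (zero diagonal) for +1/-1 patterns."""
    n = len(patterns[0])
    return [[0.0 if i == j else sum(p[i] * p[j] for p in patterns) / n
             for j in range(n)] for i in range(n)]


def is_stable(weights, state):
    """A state is stable if updating every unit leaves the state unchanged."""
    for i, row in enumerate(weights):
        field = sum(w * s for w, s in zip(row, state))
        if (1 if field >= 0 else -1) != state[i]:
            return False
    return True


def classify_state(weights, patterns, state):
    """Label a state as 'learned', 'spurious' (stable, not learned) or 'unstable'."""
    if not is_stable(weights, state):
        return "unstable"
    learned = [list(p) for p in patterns]
    return "learned" if list(state) in learned else "spurious"


# Two orthogonal example patterns (hypothetical data, 8 units).
PATTERNS = [[1, 1, 1, 1, -1, -1, -1, -1],
            [1, -1, 1, -1, 1, -1, 1, -1]]
WEIGHTS = train_hebbian(PATTERNS)

if __name__ == "__main__":
    inverse = [-s for s in PATTERNS[0]]
    for state in (PATTERNS[0], inverse, [1] * 8):
        print(state, classify_state(WEIGHTS, PATTERNS, state))
```

In this toy setting the inverse of a learned pattern is also a fixed point, which is one simple way a stable state that was never presented during training can arise.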

energy profile, hopfield network, prototype, (14 more...)

arXiv.org Artificial Intelligence

2503.03018

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Oceania > New Zealand > South Island > Otago > Dunedin (0.04)
North America > United States > New York > New York County > New York City (0.04)
(6 more...)

Genre: Research Report (0.64)

Industry: Energy (0.36)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)

Continuous Integration Practices in Machine Learning Projects: The Practitioners' Perspective

Bernardo, João Helis, da Costa, Daniel Alencar, Cogo, Filipe Roseiro, de Medeiros, Sérgio Queiróz, Kulesza, Uirá

arXiv.org Artificial Intelligence, Feb-24-2025

Continuous Integration (CI) is a cornerstone of modern software development. However, while widely adopted in traditional software projects, applying CI practices to Machine Learning (ML) projects presents distinctive characteristics. For example, our previous work revealed that ML projects often experience longer build durations and lower test coverage rates compared to their non-ML counterparts. Building on these quantitative findings, this study surveys 155 practitioners from 47 ML projects to investigate the underlying reasons for these distinctive characteristics through a qualitative perspective. Practitioners highlighted eight key differences, including test complexity, infrastructure requirements, and build duration and stability. Common challenges mentioned by practitioners include higher project complexity, model training demands, extensive data handling, increased computational resource needs, and dependency management, all contributing to extended build durations. Furthermore, ML systems' non-deterministic nature, data dependencies, and computational constraints were identified as significant barriers to effective testing. The key takeaway from this study is that while foundational CI principles remain valuable, ML projects require tailored approaches to address their unique challenges. To bridge this gap, we propose a set of ML-specific CI practices, including tracking model performance metrics and prioritizing test execution within CI pipelines. Additionally, our findings highlight the importance of fostering interdisciplinary collaboration to strengthen the testing culture in ML projects. By bridging quantitative findings with practitioners' insights, this study provides a deeper understanding of the interplay between CI practices and the unique demands of ML projects, laying the groundwork for more efficient and robust CI strategies in this domain.
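One of the practices named in the abstract above, tracking model performance metrics within a CI pipeline, could look roughly like the following gate. This is a sketch under stated assumptions, not a practice taken from the paper: the function names, the metric names, the tolerance value, and the convention that higher metric values are better are all hypothetical.

```python
def metric_regressions(baseline, current, tolerance=0.01):
    """Return the names of metrics that dropped by more than `tolerance`.

    Assumes higher is better for every metric. A metric that is missing
    from `current` is treated as a regression.
    """
    return sorted(name for name, old in baseline.items()
                  if current.get(name, float("-inf")) < old - tolerance)


def ci_exit_code(baseline, current, tolerance=0.01):
    """Return 0 if no metric regressed, 1 otherwise (suitable for sys.exit)."""
    failed = metric_regressions(baseline, current, tolerance)
    for name in failed:
        print(f"metric regressed: {name}")
    return 1 if failed else 0


if __name__ == "__main__":
    # Hypothetical values; a real pipeline would load these from stored artifacts.
    baseline = {"accuracy": 0.90, "f1": 0.85}
    current = {"accuracy": 0.895, "f1": 0.80}
    raise SystemExit(ci_exit_code(baseline, current))
```

A tolerance is used because the abstract identifies non-determinism in ML systems as a barrier to testing; an exact-match check would fail on ordinary run-to-run variation.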

build duration, ml project, participant, (10 more...)

arXiv.org Artificial Intelligence

2502.17378

Country:

South America > Brazil > Rio Grande do Norte > Natal (0.04)
Oceania > New Zealand > South Island > Otago > Dunedin (0.04)
North America > Canada > Ontario > Kingston (0.04)
(2 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Software Engineering (1.00)
Information Technology > Software (1.00)
Information Technology > Data Science (1.00)
(2 more...)

Irregularity-Informed Time Series Analysis: Adaptive Modelling of Spatial and Temporal Dynamics

Zheng, Liangwei Nathan, Li, Zhengyang, Dong, Chang George, Zhang, Wei Emma, Yue, Lin, Xu, Miao, Maennel, Olaf, Chen, Weitong

arXiv.org Artificial Intelligence, Oct-16-2024

Irregular Time Series Data (IRTS) has shown increasing prevalence in real-world applications. We observe that IRTS can be divided into two specialized types: Natural Irregular Time Series (NIRTS) and Accidental Irregular Time Series (AIRTS). Existing methods either ignore the impact of irregular patterns or statically learn the irregular dynamics of NIRTS and AIRTS data, and they suffer from limited data availability due to the sparsity of IRTS. We propose a novel transformer-based framework for general irregular time series data that treats IRTS from four views (Locality, Time, Spatio and Irregularity) to exploit the data to its fullest potential. Moreover, we design a sophisticated irregularity-gate mechanism to adaptively select task-relevant information from irregularity, which improves the generalization ability to various IRTS data. We conduct extensive experiments to demonstrate the robustness of our work on three datasets with high missing ratios (88.4%, 94.9%, and 60% missing values) and investigate the significance of the irregularity information for both NIRTS and AIRTS through an additional ablation study. We release our implementation at https://github.com/IcurasLW/MTSFormer-Irregular_Time_Series.git

artificial intelligence, information, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2410.12257

Country:

Oceania > New Zealand > South Island > Otago > Dunedin (0.04)
Oceania > Australia > Queensland (0.04)
North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.04)
Asia (0.04)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area (0.46)
Health & Medicine > Health Care Technology (0.46)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.82)

Task-Adaptive Pretrained Language Models via Clustered-Importance Sampling

Grangier, David, Fan, Simin, Seto, Skyler, Ablin, Pierre

arXiv.org Artificial Intelligence, Sep-30-2024

Specialist language models (LMs) focus on a specific task or domain on which they often outperform generalist LMs of the same size. However, the specialist data needed to pretrain these models is only available in limited amounts for most tasks. In this work, we build specialist models from large generalist training sets instead. We adjust the training distribution of the generalist data with guidance from the limited domain-specific data. We explore several approaches, with clustered importance sampling standing out. This method clusters the generalist dataset and samples from these clusters based on their frequencies in the smaller specialist dataset. It is scalable, suitable for pretraining and continued pretraining, and works well in multi-task settings. Our findings demonstrate improvements across different domains in terms of language modeling perplexity and accuracy on multiple-choice question tasks. We also present ablation studies that examine the impact of dataset sizes, clustering configurations, and model sizes. Generalist language models (LMs) can address a wide variety of tasks, but this generality comes at a cost (Brown et al., 2020). It necessitates a large training set representative of all prospective tasks, as well as a large model to fit such a comprehensive dataset.
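The sampling idea described in the abstract above (cluster the generalist data, then draw from clusters according to their frequencies in the specialist data) can be sketched in a few lines. This is an illustrative toy, not the authors' implementation: the one-dimensional "examples", the fixed centroids, the nearest-centroid assignment, and all function names are assumptions standing in for real text embeddings and a real clustering step.

```python
import random
from collections import Counter, defaultdict


def nearest_centroid(x, centroids):
    """Index of the centroid closest to x (stand-in for a clustering model)."""
    return min(range(len(centroids)), key=lambda k: abs(x - centroids[k]))


def cluster_histogram(data, centroids):
    """Fraction of `data` assigned to each cluster."""
    counts = Counter(nearest_centroid(x, centroids) for x in data)
    return [counts.get(k, 0) / len(data) for k in range(len(centroids))]


def sample_generalist(generalist, specialist, centroids, n, seed=0):
    """Draw n generalist examples, picking clusters by specialist frequency."""
    rng = random.Random(seed)
    by_cluster = defaultdict(list)
    for x in generalist:
        by_cluster[nearest_centroid(x, centroids)].append(x)
    target = cluster_histogram(specialist, centroids)
    # Only clusters that contain generalist examples can be sampled from.
    ids = [k for k in range(len(centroids)) if by_cluster[k]]
    chosen = rng.choices(ids, weights=[target[k] for k in ids], k=n)
    return [rng.choice(by_cluster[k]) for k in chosen]


if __name__ == "__main__":
    centroids = [0.0, 10.0, 20.0]                       # hypothetical clusters
    generalist = [0.1, 0.5, 9.0, 10.5, 11.0, 19.0, 21.0]
    specialist = [9.5, 10.2, 10.9, 20.5]
    print(cluster_histogram(specialist, centroids))
    print(sample_generalist(generalist, specialist, centroids, n=5))
```

Because the specialist set in this toy example never falls into the first cluster, that cluster receives zero weight and no generalist examples from it are drawn. The sketch raises an error if no specialist mass lands on any non-empty generalist cluster; a fuller version would need a fallback for that case.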

computational linguistic, large language model, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2410.03735

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Germany > Berlin (0.04)
South America > Colombia > Meta Department > Villavicencio (0.04)
(12 more...)

Genre: Research Report > New Finding (1.00)

Industry: Education (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.47)

Machine Learning for Raman Spectroscopy-based Cyber-Marine Fish Biochemical Composition Analysis

Zhou, Yun, Chen, Gang, Xue, Bing, Zhang, Mengjie, Rooney, Jeremy S., Lagutin, Kirill, MacKenzie, Andrew, Gordon, Keith C., Killeen, Daniel P.

arXiv.org Artificial Intelligence, Sep-29-2024

The rapid and accurate detection of biochemical compositions in fish is a crucial real-world task that facilitates optimal utilization and extraction of high-value products in the seafood industry. Raman spectroscopy provides a promising solution for quickly and non-destructively analyzing the biochemical composition of fish by associating Raman spectra with biochemical reference data using machine learning regression models. This paper investigates different regression models to address this task and proposes a new design of Convolutional Neural Networks (CNNs) for jointly predicting water, protein, and lipid yield. To the best of our knowledge, we are the first to conduct a successful study employing CNNs to analyze the biochemical composition of fish based on a very small Raman spectroscopic dataset. Our approach combines a tailored CNN architecture with a comprehensive data preparation procedure, effectively mitigating the challenges posed by extreme data scarcity. The results demonstrate that our CNN can significantly outperform two state-of-the-art CNN models and multiple traditional machine learning models, paving the way for accurate and automated analysis of fish biochemical composition.

data augmentation, dataset, snv 2nd, (12 more...)

arXiv.org Artificial Intelligence

2409.19688

Country:

Oceania > New Zealand > South Island > Otago > Dunedin (0.04)
Oceania > New Zealand > North Island > Wellington Region > Wellington (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Food & Agriculture > Fishing (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)
